UNN-WePS: Web Person Search using co-Present Names and Lexical Chains
Abstract
We describe UNN-WePS, a system for identifying individuals from web pages using data from SemEval Task 13. Our system uses the co-presence of person names to form seed clusters. These are then extended with pages that are deemed conceptually similar on the basis of a lexical chaining analysis computed using Roget’s thesaurus. Finally, a single-link hierarchical agglomerative clustering algorithm merges the enhanced clusters to identify individual entities. UNN-WePS achieved an average purity of 0.6 and an inverse purity of 0.73.
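The pipeline in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy pages, the threshold, and the Jaccard term overlap (used here as a stand-in for the lexical-chain similarity computed with Roget's thesaurus) are all assumptions made for illustration. Purity and inverse purity are computed in their usual form for non-overlapping clusters, which may differ in detail from the official task scorer.

```python
from itertools import combinations


def seed_clusters(page_names, query_name):
    """Group pages that share at least one co-present name other than the query name."""
    parent = {page: page for page in page_names}

    def find(page):
        # Union-find lookup with path halving.
        while parent[page] != page:
            parent[page] = parent[parent[page]]
            page = parent[page]
        return page

    for a, b in combinations(page_names, 2):
        if (page_names[a] & page_names[b]) - {query_name}:
            parent[find(a)] = find(b)

    groups = {}
    for page in page_names:
        groups.setdefault(find(page), set()).add(page)
    return list(groups.values())


def single_link_merge(clusters, similarity, threshold):
    """Repeatedly merge the closest pair of clusters while their best link >= threshold."""
    clusters = [set(c) for c in clusters]
    while len(clusters) > 1:
        best_score, best_pair = threshold, None
        for i, j in combinations(range(len(clusters)), 2):
            # Single link: cluster similarity is that of the most similar member pair.
            score = max(similarity(a, b) for a in clusters[i] for b in clusters[j])
            if score >= best_score:
                best_score, best_pair = score, (i, j)
        if best_pair is None:
            break
        i, j = best_pair
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


def purity(clusters, gold):
    """Fraction of pages that fall in the best-matching gold class of their cluster."""
    total = sum(len(c) for c in clusters)
    return sum(max(len(c & g) for g in gold) for c in clusters) / total


# Hypothetical toy data: names found on each page, and content terms per page.
names = {
    "p1": {"John Smith", "Alice Jones"},
    "p2": {"John Smith", "Alice Jones", "Bob Ray"},
    "p3": {"John Smith", "Carol Lee"},
    "p4": {"John Smith"},
}
terms = {
    "p1": {"chemistry", "lab"},
    "p2": {"chemistry", "professor"},
    "p3": {"lab", "chemistry", "research"},
    "p4": {"football", "coach"},
}


def jaccard(a, b):
    # Stand-in similarity; the paper uses lexical chains instead.
    return len(terms[a] & terms[b]) / len(terms[a] | terms[b])


seeds = seed_clusters(names, "John Smith")
final = single_link_merge(seeds, jaccard, threshold=0.5)
gold = [{"p1", "p2"}, {"p3", "p4"}]
print(final, purity(final, gold), purity(gold, final))
```

Inverse purity is obtained by swapping the roles of the system clusters and the gold classes, as in the last call. A lower merge threshold tends to raise inverse purity at the expense of purity, which is one way to read the gap between the reported 0.6 and 0.73.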
Related resources
Person Name Disambiguation on the Web Using Query Expansion
The more important web search becomes, the bigger the same-name problem in web search becomes. One proposed solution is to form clusters of people from the search results. In this paper, we report our algorithm for disambiguating person names in web search results. Our clustering algorithm is based on hierarchical agglomerative clustering using named entities, compound key words and URLs as features fo...
Exploiting Web querying for Web People Search in WePS2
Searching for people on the Web is one of the most common query types submitted to web search engines today. However, when a person name is queried, the returned results often contain webpages related to several distinct namesakes who share the queried name. The task of disambiguating and finding the webpages related to the specific person of interest is left to the user. Many Web People Search (WePS)...
Person Name Disambiguation on the Web by Two-Stage Clustering
The more important web searching becomes, the more we have to focus on the “same name” problem in web searches. In this paper, we report our algorithm for disambiguating person names in web search results. It is a document clustering algorithm based on hierarchical agglomerative clustering using named entities, compound keywords, and URLs as features for document similarity calculation. We prop...
Automatic Detection of Name Disambiguation and Extracting Aliases for the Personal Name
An individual can be referred to by multiple name aliases on the web. Extracting the aliases of a name is important in information retrieval, sentiment analysis and name disambiguation. We propose a novel approach to finding the aliases of a given name, based on automatically extracted lexical patterns. We exploit a set of known names and their aliases as training data and extract lexical patterns th...
Automatic Discovery of Lexical Patterns using Pattern Extraction Algorithm to Identify Personal Name Aliases with Entities
Personal name aliases are extremely significant in information retrieval for retrieving complete information about a person from the web, since some web pages about the person may refer to him or her by an alias, nickname, or real name. People search involving personal name aliases is growing rapidly. We proposed a pattern generator which includes aut...
Publication date: 2007